ensemble algorithm
Interpretable Clustering Ensemble
Lv, Hang, Hu, Lianyu, Jiang, Mudi, Liu, Xinying, He, Zengyou
--Clustering ensemble has emerged as an important research topic in the field of machine learning. Although numerous methods have been proposed to improve clustering quality, most existing approaches overlook the need for interpretability in high-stakes applications. In domains such as medical diagnosis and financial risk assessment, algorithms must not only be accurate but also interpretable to ensure transparent and trustworthy decision-making. Therefore, to fill the gap of lack of interpretable algorithms in the field of clustering ensemble, we propose the first interpretable clustering ensemble algorithm in the literature. By treating base partitions as categorical variables, our method constructs a decision tree in the original feature space and use the statistical association test to guide the tree building process. Experimental results demonstrate that our algorithm achieves comparable performance to state-of-the-art (SOT A) clustering ensemble methods while maintaining an additional feature of interpretability. T o the best of our knowledge, this is the first interpretable algorithm specifically designed for clustering ensemble, offering a new perspective for future research in interpretable clustering. LUSTERING analysis [1] is an unsupervised learning issue in the field of data mining, which aims to partition data into different clusters by exploring its intrinsic structure.
EnsIR: An Ensemble Algorithm for Image Restoration via Gaussian Mixture Models
Image restoration has experienced significant advancements due to the development of deep learning. Nevertheless, it encounters challenges related to ill-posed problems, resulting in deviations between single model predictions and ground-truths. Ensemble learning, as a powerful machine learning technique, aims to address these deviations by combining the predictions of multiple base models. Most existing works adopt ensemble learning during the design of restoration models, while only limited research focuses on the inference-stage ensemble of pre-trained restoration models. Regression-based methods fail to enable efficient inference, leading researchers in academia and industry to prefer averaging as their choice for post-training ensemble.
Adaptive Ensembles of Fine-Tuned Transformers for LLM-Generated Text Detection
Lai, Zhixin, Zhang, Xuesheng, Chen, Suiyao
Large language models (LLMs) have reached human-like proficiency in generating diverse textual content, underscoring the necessity for effective fake text detection to avoid potential risks such as fake news in social media. Previous research has mostly tested single models on in-distribution datasets, limiting our understanding of how these models perform on different types of data for LLM-generated text detection task. We researched this by testing five specialized transformer-based models on both in-distribution and out-of-distribution datasets to better assess their performance and generalizability. Our results revealed that single transformer-based classifiers achieved decent performance on in-distribution dataset but limited generalization ability on out-of-distribution dataset. To improve it, we combined the individual classifiers models using adaptive ensemble algorithms, which improved the average accuracy significantly from 91.8% to 99.2% on an in-distribution test set and from 62.9% to 72.5% on an out-of-distribution test set. The results indicate the effectiveness, good generalization ability, and great potential of adaptive ensemble algorithms in LLM-generated text detection.
Towards a Systematic Approach to Design New Ensemble Learning Algorithms
Mendes-Moreira, Joรฃo, Mendes-Neves, Tiago
Ensemble learning has been a focal point of machine learning research due to its potential to improve predictive performance. This study revisits the foundational work on ensemble error decomposition, historically confined to bias-variance-covariance analysis for regression problems since the 1990s. Recent advancements introduced a "unified theory of diversity," which proposes an innovative bias-variance-diversity decomposition framework. Leveraging this contemporary understanding, our research systematically explores the application of this decomposition to guide the creation of new ensemble learning algorithms. Focusing on regression tasks, we employ neural networks as base learners to investigate the practical implications of this theoretical framework. This approach used 7 simple ensemble methods, we name them strategies, for neural networks that were used to generate 21 new ensemble algorithms. Among these, most of the methods aggregated with the snapshot strategy, one of the 7 strategies used, showcase superior predictive performance across diverse datasets w.r.t. the Friedman rank test with the Conover post-hoc test. Our systematic design approach contributes a suite of effective new algorithms and establishes a structured pathway for future ensemble learning algorithm development.
A generalized decision tree ensemble based on the NeuralNetworks architecture: Distributed Gradient Boosting Forest (DGBF)
Delgado-Panadero, รngel, Benรญtez-Andrades, Josรฉ Alberto, Garcรญa-Ordรกs, Marรญa Teresa
Tree ensemble algorithms as RandomForest and GradientBoosting are currently the dominant methods for modeling discrete or tabular data, however, they are unable to perform a hierarchical representation learning from raw data as NeuralNetworks does thanks to its multi-layered structure, which is a key feature for DeepLearning problems and modeling unstructured data. This limitation is due to the fact that tree algorithms can not be trained with back-propagation because of their mathematical nature. However, in this work, we demonstrate that the mathematical formulation of bagging and boosting can be combined together to define a graph-structured-tree-ensemble algorithm with a distributed representation learning process between trees naturally (without using back-propagation). We call this novel approach Distributed Gradient Boosting Forest (DGBF) and we demonstrate that both RandomForest and GradientBoosting can be expressed as particular graph architectures of DGBT. Finally, we see that the distributed learning outperforms both RandomForest and GradientBoosting in 7 out of 9 datasets.
Improving Reinforcement Learning from Human Feedback with Efficient Reward Model Ensemble
Zhang, Shun, Chen, Zhenfang, Chen, Sunli, Shen, Yikang, Sun, Zhiqing, Gan, Chuang
Reinforcement Learning from Human Feedback (RLHF) is a widely adopted approach for aligning large language models with human values. However, RLHF relies on a reward model that is trained with a limited amount of human preference data, which could lead to inaccurate predictions. As a result, RLHF may produce outputs that are misaligned with human values. To mitigate this issue, we contribute a reward ensemble method that allows the reward model to make more accurate predictions. As using an ensemble of large language model-based reward models can be computationally and resource-expensive, we explore efficient ensemble methods including linear-layer ensemble and LoRA-based ensemble. Empirically, we run Best-of-$n$ and Proximal Policy Optimization with our ensembled reward models, and verify that our ensemble methods help improve the alignment performance of RLHF outputs.
Online Ensemble of Models for Optimal Predictive Performance with Applications to Sector Rotation Strategy
Asset-specific factors are commonly used to forecast financial returns and quantify asset-specific risk premia. Using various machine learning models, we demonstrate that the information contained in these factors leads to even larger economic gains in terms of forecasts of sector returns and the measurement of sector-specific risk premia. To capitalize on the strong predictive results of individual models for the performance of different sectors, we develop a novel online ensemble algorithm that learns to optimize predictive performance. The algorithm continuously adapts over time to determine the optimal combination of individual models by solely analyzing their most recent prediction performance. This makes it particularly suited for time series problems, rolling window backtesting procedures, and systems of potentially black-box models. We derive the optimal gain function, express the corresponding regret bounds in terms of the out-of-sample R-squared measure, and derive optimal learning rate for the algorithm. Empirically, the new ensemble outperforms both individual machine learning models and their simple averages in providing better measurements of sector risk premia. Moreover, it allows for performance attribution of different factors across various sectors, without conditioning on a specific model. Finally, by utilizing monthly predictions from our ensemble, we develop a sector rotation strategy that significantly outperforms the market. The strategy remains robust against various financial factors, periods of financial distress, and conservative transaction costs. Notably, the strategy's efficacy persists over time, exhibiting consistent improvement throughout an extended backtesting period and yielding substantial profits during the economic turbulence of the COVID-19 pandemic.
Towards Practical Application of Deep Learning in Diagnosis of Alzheimer's Disease
Accurate diagnosis of Alzheimer's disease (AD) is both challenging and time consuming. With a systematic approach for early detection and diagnosis of AD, steps can be taken towards the treatment and prevention of the disease. This study explores the practical application of deep learning models for diagnosis of AD. Due to computational complexity, large training times and limited availability of labelled dataset, a 3D full brain CNN (convolutional neural network) is not commonly used, and researchers often prefer 2D CNN variants. In this study, full brain 3D version of well-known 2D CNNs were designed, trained and tested for diagnosis of various stages of AD. Deep learning approach shows good performance in differentiating various stages of AD for more than 1500 full brain volumes. Along with classification, the deep learning model is capable of extracting features which are key in differentiating the various categories. The extracted features align with meaningful anatomical landmarks, that are currently considered important in identification of AD by experts. An ensemble of all the algorithm was also tested and the performance of the ensemble algorithm was superior to any individual algorithm, further improving diagnosis ability. The 3D versions of the trained CNNs and their ensemble have the potential to be incorporated in software packages that can be used by physicians/radiologists to assist them in better diagnosis of AD.
Context-Aware Ensemble Learning for Time Series
Fazla, Arda, Aydin, Mustafa Enes, Tamyigit, Orhun, Kozat, Suleyman Serdar
We investigate ensemble methods for prediction in an online setting. Unlike all the literature in ensembling, for the first time, we introduce a new approach using a meta learner that effectively combines the base model predictions via using a superset of the features that is the union of the base models' feature vectors instead of the predictions themselves. Here, our model does not use the predictions of the base models as inputs to a machine learning algorithm, but choose the best possible combination at each time step based on the state of the problem. We explore three different constraint spaces for the ensembling of the base learners that linearly combines the base predictions, which are convex combinations where the components of the ensembling vector are all nonnegative and sum up to 1; affine combinations where the weight vector components are required to sum up to 1; and the unconstrained combinations where the components are free to take any real value. The constraints are both theoretically analyzed under known statistics and integrated into the learning procedure of the meta learner as a part of the optimization in an automated manner. To show the practical efficiency of the proposed method, we employ a gradient-boosted decision tree and a multi-layer perceptron separately as the meta learners. Our framework is generic so that one can use other machine learning architectures as the ensembler as long as they allow for a custom differentiable loss for minimization. We demonstrate the learning behavior of our algorithm on synthetic data and the significant performance improvements over the conventional methods over various real life datasets, extensively used in the well-known data competitions. Furthermore, we openly share the source code of the proposed method to facilitate further research and comparison.